Introduction
Monkeypox is a rare zoonotic disease that is caused by the monkeypox virus (MPXV) from the Orthopoxvirus genus, which also includes the variola virus (causative agent of smallpox)1,2,3. Although the natural reservoir of MPXV remains unknown, animals such as rodents and non-human primates may harbour the virus, leading to occasional spill-over events to humans1,2,3. MPXV is endemic in West and Central African countries and the rare reports outside these regions are associated with imports from those endemic countries1,2,3,4. Still, we are now facing the first worldwide outbreak without known epidemiological links to West or Central Africa1, with 219 confirmed cases already reported worldwide, as of May 25th 5, since a first confirmed case on 7 May 20224. Several measures are being taken by international health authorities to contain the transmission1, including the potential use of the smallpox vaccine for post-exposure prophylaxis of close contacts. The virus can be transmitted from one person to another by close contact with lesions, body fluids, respiratory droplets and contaminated materials1,3, but the current epidemiological context poses some degree of uncertainty about the viral transmission dynamics and outbreak magnitude.
International sequencing efforts immediately began to characterize the outbreak-causing MPXV, in order to identify its origin and track its worldwide dissemination. Genome data will also shed light on the virus's evolutionary trajectory, genetic diversity and phenotypic characteristics, which are relevant for guiding diagnostics, prophylaxis and research.
Here, we report the rapid application of high-throughput shotgun metagenomics to reconstruct the first genome sequences of the MPXV associated with this worldwide outbreak, providing valuable genomic and phylogenetic data on this emerging threat. The analysis includes a first outbreak-related MPXV genome sequence, publicly released on May 20th, 2022 by Portugal5, as well as additional sequences released on NCBI before the 27th of May 2022, for a total of 15 sequences [Portugal (n=10), USA (n=1), Germany (n=1), France (n=1), Switzerland (n=1), Slovenia (n=1)] (Supplementary Tables 1 and 2). The rapid integration of the first sequence into the global MPXV genetic diversity (Figure 1) confirmed that the 2022 outbreak virus belongs to the West African (WA) clade. MPXV from the WA clade is most commonly reported from western Cameroon to Sierra Leone and usually carries a <1% case-fatality ratio (CFR), in contrast with viruses from the “Central African” (CA) (or “Congo Basin”) clade, which are considered more virulent, with a CFR >10%7,8. All outbreak MPXV genomes sequenced so far cluster tightly together (Figure 1), suggesting that the ongoing worldwide outbreak has a single origin. The 2022 outbreak cluster forms a divergent branch descending from a branch with viruses associated with the exportation of MPXV in 2018 and 2019 from an endemic country (Nigeria) to the United Kingdom, Israel and Singapore9,10, with genetic linkage to a large outbreak occurring in Nigeria in 2017-201810 (Figure 1). Given these findings and the worldwide epidemiology of MPXV, it is likely that the emergence of the 2022 outbreak resulted from recent importation(s) of this MPXV variant from an endemic country. Still, one cannot completely rule out the hypothesis of a prolonged period of cryptic dissemination in humans or animals in a non-endemic country (e.g., after the reported 2018-2019 importations).
Silent human-to-human transmission seems less likely considering the known disease characteristics of the affected individuals, usually involving localised or generalised skin lesions1. Cryptic transmission in an animal host in a non-endemic country followed by a recent spill-over event is another hypothesis, even though, again, this would be somewhat surprising. In this context, the identification of the outbreak index case can be challenging, considering the expected incubation period of 5-21 days3 and the fact that multiple cases were confirmed in several countries within a three-week period1 since a first report on May 7th by the UK1. For example, although this case has been hypothesized as the index (travel from Nigeria to the UK on 3-4 May 20221,3), the earliest date of collection in Portugal is also May 4th6. Altogether, one cannot discard the existence of more than one introduction from a single origin, with superspreader event(s) likely triggering the rapid worldwide dissemination.
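To make the timing argument concrete, the sketch below back-calculates a plausible exposure window from a given date, using the 5-21 day incubation range quoted above. Treating the earliest Portuguese collection date (May 4th) as an upper bound for symptom onset is an assumption made here purely for illustration, and the function name `exposure_window` is hypothetical rather than part of the study's methods.

```python
from datetime import date, timedelta


def exposure_window(onset, min_incubation_days=5, max_incubation_days=21):
    """Return (earliest, latest) plausible exposure dates for a given onset date.

    The default incubation bounds are the 5-21 days cited in the text.
    """
    earliest = onset - timedelta(days=max_incubation_days)
    latest = onset - timedelta(days=min_incubation_days)
    return earliest, latest


# Assumption: the sample collection date stands in for the latest possible onset.
collection_date = date(2022, 5, 4)
earliest, latest = exposure_window(collection_date)
print(earliest.isoformat(), latest.isoformat())
```

Under these assumptions the exposure would fall between April 13th and April 29th, that is, before the 3-4 May travel dates, which is in line with the caution expressed above about naming an index case.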
Remarkably, the 2022 MPXV diverges by a mean of 50 SNPs from the related 2018-2019 viruses (Figure 1 and 2), which is far more (roughly 6-12 fold more) than one would expect considering previous estimates of the substitution rate for Orthopoxviruses (1-2 substitutions per genome per year)11. Such a divergent branch might represent a recent evolutionary jump. Of note, among the 46 SNPs (24 non-synonymous, 18 synonymous, 4 intergenic) (Supplementary Table 3) separating the 2022 MPXV outbreak virus from the closest reference sequence (MPXV-UK_P2, 2018; GenBank accession #MT903344.1), 3 amino acid changes (D209N, P722S, M1741I) occurred in the immunogenic surface glycoprotein B21R (MPXV-UK_P2-182)12. Serological studies have previously indicated that the monkeypox B21R protein might be an important antibody target with several key immunodominant epitopes12. As discussed after our release of the first 2022 MPXV outbreak sequence13, fine inspection of the mutation profile of those 46 SNPs further revealed a strong mutational bias, with 26 (14 non-synonymous, 10 synonymous, 2 intergenic) and 15 (9 non-synonymous, 6 synonymous) being GA>AA and TC>TT nucleotide replacements, respectively (Figure 2, Supplementary Table 3). This (hyper)mutation signature might suggest the potential action of apolipoprotein B mRNA-editing catalytic polypeptide-like 3 (APOBEC3) enzymes in the viral genome editing14. In fact, APOBEC enzymes can be upregulated in response to viral infection, being capable of inhibiting a wide range of viruses by introducing mutations through deaminase and deaminase-independent mechanisms14,15. In some circumstances (e.g., lower levels of deamination), APOBEC-mediated mutations might not completely disrupt the virus, thus increasing the likelihood of producing hyper-mutated (but viable) variants with altered characteristics (e.g. HIV immune escape variants)14.
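The "roughly 6-12 fold" excess can be reproduced with simple arithmetic. The sketch below assumes that the quoted rate is read as whole-genome substitutions per year and that about four years separate the 2018 reference viruses from the 2022 outbreak; both are assumptions made for illustration, and the function names are hypothetical.

```python
def expected_substitutions(rate_per_genome_per_year, years):
    """Substitutions expected to accumulate over the elapsed time."""
    return rate_per_genome_per_year * years


def fold_excess(observed_snps, rate_per_genome_per_year, years):
    """Ratio between observed SNPs and the number expected from the rate."""
    return observed_snps / expected_substitutions(rate_per_genome_per_year, years)


observed = 50        # mean divergence reported in the text
elapsed_years = 4    # assumption: 2018 to 2022

for rate in (2, 1):  # upper and lower bound of the quoted rate
    print(rate, fold_excess(observed, rate, elapsed_years))
```

With these inputs the ratio ranges from 6.25 (rate of 2 per year) to 12.5 (rate of 1 per year). A shorter elapsed time would make the excess larger still.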
The repertoire of APOBEC3 enzymes depends on the host species, and different enzymes display different preferences for the nucleotide (or motif, such as dinucleotides or tetranucleotides) to be mutated14,16. For instance, the GA>AA and TC>TT nucleotide replacements observed in the 2022 outbreak MPXV have also been found to be the preferred mutational pattern of human APOBEC3A enzymes (expressed in keratinocytes and skin) during genetic editing of human papillomavirus (HPV) in HPV1a plantar warts and HPV16 precancerous cervical biopsies17. Whether the excess of mutations seen in the 2022 MPXV is a direct consequence of APOBEC-mediated genome editing cannot be determined at this stage. Similarly, one cannot conclude at this stage whether the observed evolutionary jump triggered MPXV adaptive evolution towards altered phenotypic features, such as enhanced transmissibility.
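The dinucleotide signature discussed above can be checked with a short script. The following is a minimal sketch, not the pipeline used in this study: it assumes SNPs are given as (0-based position, alternate base) pairs against a reference string, and the names `classify_snp` and `tally_signature` are hypothetical. GA>AA is the reverse-complement view of TC>TT, so the two patterns correspond to the same change seen from either strand.

```python
from collections import Counter


def classify_snp(reference, pos, alt):
    """Label a SNP as 'TC>TT', 'GA>AA' or 'other' from its dinucleotide context.

    reference: reference sequence (string of A/C/G/T)
    pos: 0-based position of the substituted base
    alt: alternate base observed at that position
    """
    reference = reference.upper()
    ref, alt = reference[pos], alt.upper()
    # C>T preceded by a T on the reported strand
    if ref == "C" and alt == "T" and pos > 0 and reference[pos - 1] == "T":
        return "TC>TT"
    # G>A followed by an A: the same change read from the opposite strand
    if ref == "G" and alt == "A" and pos + 1 < len(reference) and reference[pos + 1] == "A":
        return "GA>AA"
    return "other"


def tally_signature(reference, snps):
    """Count SNPs per context class; snps is an iterable of (pos, alt) pairs."""
    return Counter(classify_snp(reference, pos, alt) for pos, alt in snps)


# Toy sequence for illustration only (not MPXV data).
toy_reference = "ATCGGAAC"
toy_snps = [(2, "T"), (4, "A"), (3, "A"), (7, "T")]
print(tally_signature(toy_reference, toy_snps))
```

Applied to a real SNP table, the share of variants falling into the two signature classes gives a quick measure of the mutational bias, although it does not by itself demonstrate APOBEC3 activity.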
Further phylogenomic analysis revealed the first signs of microevolution of this virus during human-to-human transmission throughout the outbreak. Among the 15 outbreak sequences analysed here, we detected the emergence of 15 SNPs (8 non-synonymous, 4 synonymous, 2 intergenic and 1 stop gained) (Figure 2; Supplementary Table 4). Notably, all SNPs also follow the same mutational bias described above, including eight GA>AA (6 non-synonymous, 2 synonymous) and seven TC>TT (2 non-synonymous, 2 synonymous, 1 stop gained and 2 intergenic) nucleotide replacements. This might suggest a continuous action of APOBEC during MPXV evolution. Among the 7 phylogenetic branches directly descending from the most recent ancestor of the MPXV outbreak variant (Figure 1), we identified a sub-cluster (supported by 2 SNPs) of two sequences (PT0005 and PT0008, each with an additional SNP) that also share a 913-bp frameshift deletion in a gene coding for an Ankyrin/Host Range protein (MPXV-UK_P2-010). Gene loss events were already observed in the context of endemic MPXV circulation in Central Africa, being hypothesized to correlate with human-to-human transmission18.
Other clues of ongoing viral evolution (and potential human adaptation) could be retrieved from our data. Most emerging SNPs in sequences from Portugal were not 100% fixed in the viral population (frequencies: 75%-95%), supporting the existence of viral intra-patient population diversity. Further inspection of minor intra-patient single nucleotide variants (iSNVs) in Illumina samples led to the validation of 11 non-synonymous minor iSNVs (across 5 samples), again most with the “APOBEC signature” (Supplementary Table 5). Notably, among the targeted proteins, we highlight a few proteins potentially interacting with the host immune system, such as an MHC class II antigen presentation inhibitor, an IFN-alpha/beta receptor glycoprotein, an IL-1/TLR signaling inhibitor and the 36kDa major membrane protein.
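The distinction between fixed SNPs, dominant but unfixed SNPs, and minor iSNVs can be illustrated from read counts at a single site. The sketch below is illustrative only: the 1% detection floor, the 50% boundary between minor and major variants, and the 99% cut-off for fixation are assumptions chosen for the example, not thresholds reported in this study, and the function names are hypothetical.

```python
def alt_frequency(alt_reads, total_reads):
    """Fraction of reads supporting the alternate allele at a site."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads


def classify_variant(alt_reads, total_reads, minor_floor=0.01, fixed_cutoff=0.99):
    """Classify a site by alternate-allele frequency (thresholds are assumptions)."""
    freq = alt_frequency(alt_reads, total_reads)
    if freq < minor_floor:
        return "below_threshold"   # too rare to separate from sequencing noise
    if freq < 0.5:
        return "minor_iSNV"        # present, but not the dominant allele
    if freq < fixed_cutoff:
        return "major_not_fixed"   # dominant allele, still polymorphic
    return "fixed"


# Hypothetical read counts for illustration.
for alt, depth in [(850, 1000), (30, 1000), (1000, 1000), (5, 1000)]:
    print(alt, depth, classify_variant(alt, depth))
```

A site with 850 of 1000 reads supporting the alternate base (85%) would fall in the "major_not_fixed" class, matching the 75%-95% frequency range described above for most emerging SNPs.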
In summary, our genomic and phylogenomic data inform about the evolutionary trajectory of the 2022 MPXV outbreak variant, shedding light on potential mechanisms and targets of human adaptation. This study also shows that viral genome sequencing might provide enough resolution to track the transmission dynamics and outbreak spread, which seemed to be challenging for a presumably slow-evolving dsDNA virus. Together with the adopted strategy of real-time data sharing, this study might contribute to guiding novel outbreak control measures and subsequent research lines.